(54) Inaudible insertion of information into an audio signal

(57) A method of inaudibly inserting information into an audio signal transforms the audio signal (e.g. by
FFT) into a succession of frequency spectra over successive relatively long intervals, detects a spectral signal
peak above a given threshold for each frequency spectrum, and adds one or more test tones to the masking
area of the audio signal adjacent the spectral signal peak, the test tones having a predetermined characteristic
relative to the spectral signal peak and representing a reference signal, test signal or data, to produce a
transmission audio signal. The transmission audio signal is decoded by converting the transmission audio 
signal into a succession of frequency spectra over successive relatively long intervals, detecting a spectral 
signal peak for each frequency spectrum, searching for an associated spectral component (test tone) for each 
spectral signal peak, and decoding the associated spectral component to recover the data. Fig. 4 (not shown). 
INAUDIBLE INSERTION OF INFORMATION
INTO AN AUDIO SIGNAL 

Background of the Invention
The present invention relates to signal insertion, and more 
particularly to inaudible insertion of information into an audio signal so
that it is inaudible to humans while being recoverable by a receiving
system. 

The phenomenon of auditory masking in humans is well known
and discussed in an article by Eberhard Zwicker and U. Tilmann Zwicker
in the Journal of the Audio Engineering Society, Vol. 39, No. 3, March 1991,
entitled "Audio Engineering and Psychoacoustics: Matching Signals to the
Final Receiver, the Human Auditory System", incorporated herein by
reference. This effect is currently exploited in audio
signal compression by removing parts of the audio signal that humans
cannot hear, thereby reducing the amount of information being
transmitted.

In many instances it is desirable to insert a signal representing some
information that a receiver may want to use, such as a test signal, a
reference signal or data, into another information signal. An example is
the insertion of a vertical interval test signal (VITS) into the vertical
interval of a television video signal. VITS is inserted into a portion of the
television video signal that is not displayed to a viewer, so it is
transparent to the viewer. However it is not apparent how an information
signal could be inserted into an audio signal since there are no "non-visible"
[...] the television video signal.

What is desired is the insertion of an information signal into an
audio signal in a manner that is inaudible to a human.



Summary of the Invention
Accordingly the present [...] signal insertion [...] information [...]
regions. The encoding may be by the presence or absence of an inserted
tone or tones, the amplitude of the inserted tone or tones, the phase of the
inserted tone or tones, or combinations of these encoding techniques.
[...] or tones, is decoded to recover the original information signal.

The objects, advantages and other novel features of the present
invention are apparent from the following detailed description when read
in conjunction with the appended claims and attached drawing.

Brief Description of the Drawing
Fig. 1 is a block diagram of an encoder for inaudible audio signal 
insertion according to the present invention. 

Fig. 2 is a flow chart diagram of a method of inserting information
into an audio signal according to the present invention.

Fig. 3 is a block diagram of a decoder for inaudible audio signal 
insertion according to the present invention. 

Fig. 4 is a flow chart diagram of a method of extracting information 
from an audio signal according to the present invention. 

Description of the Preferred Embodiment
Referring now to Figs. 1 and 2 an audio signal, suitably digitized, is 
input to a digital signal processor (DSP) 12 and also stored in a memory
structure 14 configured as a delay element. The DSP 12 performs a
frequency domain transformation, such as a fast Fourier transform (FFT), 
on the digitized audio signal repetitively over a relatively long interval of 
the audio signal, such as milliseconds up to one second or more 
depending upon the particular application, converting the audio signal 
from the time domain to the frequency domain. For example for speech 
applications the interval may be long, while for music the interval may be 
short, so long as the interval is long enough to trap a relatively large peak 
while not getting too many peaks. 
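
The interval-by-interval transformation described above might be sketched as follows. This is only an illustration under stated assumptions, not the DSP 12 implementation: the function names `fft` and `successive_spectra`, the 8 kHz sample rate and the 256-sample interval are hypothetical choices, and a simple radix-2 FFT stands in for whatever frequency domain transformation a given application uses.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def successive_spectra(samples, interval_len):
    """Yield one magnitude spectrum (bins 0..N/2) per non-overlapping interval.

    Magnitudes are scaled by 2/N so that a sine wave centred on a bin reads
    as its amplitude (the DC and Nyquist bins are overstated by a factor of 2).
    """
    for start in range(0, len(samples) - interval_len + 1, interval_len):
        frame = samples[start:start + interval_len]
        spectrum = fft(frame)
        yield [abs(c) * 2.0 / interval_len
               for c in spectrum[:interval_len // 2 + 1]]

# Usage: two intervals of a sine wave centred on bin 32 (1000 Hz at 8 kHz).
SAMPLE_RATE = 8000   # assumed sample rate, not given in the text
N = 256              # assumed interval length in samples
signal = [0.5 * math.sin(2 * math.pi * 32 * n / N) for n in range(2 * N)]
spectra = list(successive_spectra(signal, N))
peak_bin = max(range(len(spectra[0])), key=spectra[0].__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N
```

A longer interval gives finer frequency resolution (the bin spacing is the sample rate divided by the interval length) at the cost of coarser timing, which is one way to read the speech versus music trade-off mentioned above.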

The output from the frequency domain transformation is a 
succession of frequency spectra. Over the interval the largest amplitude 
spectral component for the associated frequency spectrum is searched for 
to determine the signal peak. The signal peak is compared with a 
minimum threshold level, such as -18 dB. The minimum threshold is 
calculated based upon the amplitude of the inserted signal, the inserted
signal recovery technique, and the signal to noise (S/N) ratio required at
the receiver for the inserted signal. The minimum threshold may be 
precalculated for a given application and stored in the DSP 12. If the 
signal peak is less than the minimum threshold level, then the next 
interval is processed to obtain a new frequency spectrum. If the signal 
peak is above the minimum threshold and of sufficient duration, a tone is 
inserted into the audio signal in a masking area around the signal peak. 
The frequency and amplitude of the tone are based on the masking 
characteristics of the signal peak. The duration and shape of the tone are 
designed to maximize the energy in the test tone and minimize the energy
not at the test tone frequency. For example, if the test tone is at a
frequency ten percent (10%) higher than the signal peak, the shape of the
test pulse is a sine-squared bar, the 100% amplitude duration is exactly
three cycles of the test tone, and the amplitude of the test pulse is set to
-40 dB, then if the duration of the signal peak is less than that calculated
for the test tone, the test tone is not inserted and the algorithm is started 
again over the next interval. 
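
The peak test and tone insertion described above can be illustrated with a short sketch. The -18 dB threshold, the ten percent frequency offset, the three-cycle flat top and the -40 dB level come from the example in the text; everything else is an assumption made for illustration, including the function names, the two-cycle sine-squared ramps (the text does not give an edge length), and the placement of the burst at the start of the interval. The check on the duration of the signal peak is omitted here; the length comparison is only a guard that the burst fits in the interval.

```python
import math

THRESHOLD_DB = -18.0    # minimum peak level, from the example in the text
TONE_LEVEL_DB = -40.0   # inserted tone level, from the example in the text
FREQ_RATIO = 1.10       # test tone ten percent above the signal peak

def to_db(amplitude):
    """Amplitude relative to full scale (1.0), in dB."""
    return 20.0 * math.log10(max(amplitude, 1e-12))

def find_peak(magnitudes, sample_rate, interval_len):
    """Return (frequency_hz, level_db) of the largest spectral component."""
    k = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return k * sample_rate / interval_len, to_db(magnitudes[k])

def sine_squared_tone(freq_hz, sample_rate, level_db,
                      flat_cycles=3, ramp_cycles=2):
    """Tone burst with sine-squared edges and a flat top of `flat_cycles`.

    `ramp_cycles` is an assumed value; the text only fixes the flat top.
    """
    amp = 10.0 ** (level_db / 20.0)
    ramp = int(round(ramp_cycles * sample_rate / freq_hz))
    flat = int(round(flat_cycles * sample_rate / freq_hz))
    total = 2 * ramp + flat
    out = []
    for n in range(total):
        if n < ramp:
            env = math.sin(0.5 * math.pi * n / ramp) ** 2
        elif n < ramp + flat:
            env = 1.0
        else:
            env = math.sin(0.5 * math.pi * (total - n) / ramp) ** 2
        out.append(amp * env * math.sin(2.0 * math.pi * freq_hz * n / sample_rate))
    return out

def insert_tone(frame, magnitudes, sample_rate):
    """Return (new_frame, tone_hz); tone_hz is None if nothing was inserted."""
    peak_hz, peak_db = find_peak(magnitudes, sample_rate, len(frame))
    if peak_db < THRESHOLD_DB:
        return list(frame), None          # peak too weak: try the next interval
    tone_hz = peak_hz * FREQ_RATIO
    burst = sine_squared_tone(tone_hz, sample_rate, TONE_LEVEL_DB)
    if len(burst) > len(frame):
        return list(frame), None          # burst does not fit in this interval
    out = list(frame)
    for n, s in enumerate(burst):
        out[n] += s
    return out, tone_hz
```

With a 1000 Hz peak at -6 dB, for instance, `insert_tone` would add a burst near 1100 Hz whose samples never exceed 0.01 of full scale (-40 dB).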

As shown in Figs. 3 and 4 the input to a decoder 20 is the encoded
audio signal in digital form, which signal may have undergone several
processes that have changed the level of the entire signal. The encoded 
audio signal is input to a decode DSP 22 where again a frequency domain 
transformation is performed repetitively over a long interval of the input 
signal corresponding to that used at the transmitter, giving successive
frequency spectra. Again the largest amplitude spectral component in the
interval is searched for, the signal peak. For the frequency of the signal
peak an associated spectral component is searched for that was inserted by
the transmitter, i.e., a spectral component with the correct frequency offset
and pulse width. For reliability the process may be repeated over
successive intervals to assure that the expected pulse is found successfully 
a few times in succession. For repeating over several intervals the audio 
interval(s) may be stored in a random access memory 24 when the DSP 22 
is not fast enough. If the inserted pulse is not found, which is possible 
since the largest signal peak may not be the same one the encoder found 
due to differences in timing between the encoder and decoder, the decoder 
slides the interval window along in time until the signal peak found by 
the encoder is also found by the decoder. Once this synchronization of
decoder with encoder is completed, then decoding occurs continuously.
The detected pulse is then measured for amplitude, phase, etc. and, for
example, the overall input signal is adjusted to the correct level based on
the measured amplitude of the test pulse for an automatic gain control
(AGC) application, or the detected pulse is otherwise decoded for its
information content. For example in an AGC application if the pulse is
detected at -32 dB, this means that the signal level needs to be reduced by
8 dB since the transmitted pulse in this example was inserted at -40 dB.
The decoder may also remove the transmitted test pulse from the output 
signal if desired. Of course most audio compression techniques based
on masking will remove the transmitted test pulse.
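
The decoder steps described above (locate the signal peak, look for the associated component at the expected offset, slide the window until it is found, then correct the level) might be sketched as below. All function names, the one-bin search tolerance and the `min_level` detection floor are assumptions introduced for illustration; the pulse-width check and the repetition over several intervals are left out. Only the ten percent offset and the -40 dB insertion level come from the example in the text.

```python
import math

def find_test_tone(magnitudes, bin_hz, freq_ratio=1.10,
                   tolerance_bins=1, min_level=1e-3):
    """Search near peak * freq_ratio for the inserted component.

    Returns (frequency_hz, level_db), or None if nothing is found.
    `tolerance_bins` and `min_level` are assumed values.
    """
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    target = int(round(peak_bin * freq_ratio))
    lo = max(target - tolerance_bins, 0)
    hi = min(target + tolerance_bins, len(magnitudes) - 1)
    candidates = [k for k in range(lo, hi + 1) if k != peak_bin]
    if not candidates:
        return None
    best = max(candidates, key=magnitudes.__getitem__)
    if magnitudes[best] < min_level:
        return None
    return best * bin_hz, 20.0 * math.log10(magnitudes[best])

def synchronize(samples, interval_len, hop, detect):
    """Slide the interval window by `hop` samples until `detect` succeeds."""
    for start in range(0, len(samples) - interval_len + 1, hop):
        result = detect(samples[start:start + interval_len])
        if result is not None:
            return start, result
    return None

def gain_correction_db(measured_db, inserted_db=-40.0):
    """Gain change in dB that restores the level at which the tone was inserted."""
    return inserted_db - measured_db

def apply_gain(samples, gain_db):
    """Scale samples by a gain expressed in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]

# AGC example from the text: a pulse inserted at -40 dB is detected at -32 dB.
correction = gain_correction_db(-32.0)   # -8.0, i.e. reduce the level by 8 dB
```

Because the correction is computed from the test pulse alone, it does not depend on the programme level of the surrounding audio, which is what makes the inserted tone usable as a reference.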

Thus the present invention provides for insertion of inaudible 
audio signals into an audio signal at a transmitter by inserting defined 
audio tones in masking regions of the audio signal, and then extracting the 
defined audio tones at a receiver, which extracted audio tones are 
decoded. 



WHAT IS CLAIMED IS: 



1. A method of inserting inaudible audio signals into an audio signal 
comprising the steps of: 

repetitively performing a frequency domain transformation on the
audio signal over successive intervals of a first predetermined duration to 
produce successive frequency spectra; 

finding the largest amplitude spectral component within each 
frequency spectrum to determine a transmission signal peak; 

for each transmission signal peak above a given threshold adding a
test tone into the audio signal with predetermined characteristics relative 
to the transmission signal peak, the test tone representing data, to produce 
a transmission audio signal. 

2. The method as recited in claim 1 further comprising the steps of:

performing the frequency domain transformation on the 
transmission audio signal over successive intervals of a second 
predetermined duration to produce frequency spectra; 

searching each frequency spectrum for a received signal peak; 

for each received signal peak searching for the test tone; 

sliding the successive intervals of the second predetermined 
duration in time and repeating the searching steps until the test tone is 
detected; and 
decoding the test tone to recover the data represented by the test
tone.

3. The method as recited in claim 1 further comprising the steps of:

performing the frequency domain transformation on the
transmission audio signal over successive intervals of the first
predetermined duration to produce frequency spectra;

searching each frequency spectrum for a received signal peak;

for each received signal peak searching for the test tone; and

decoding the test tone to recover the data represented by the test
tone.

4. A method of inserting inaudible audio signals into an audio signal 
substantially as herein described with reference to and as shown in the 
accompanying drawings. 